Succinct Index for Dynamic Dictionary Matching

Authors

  • Wing-Kai Hon
  • Tak Wah Lam
  • Rahul Shah
  • Siu-Lung Tam
  • Jeffrey Scott Vitter
Abstract

In this paper we revisit the dynamic dictionary matching problem, which asks for an index for a set of patterns P1, P2, ..., Pk that supports the following query and update operations efficiently. Given a query text T, we want to find all occurrences of these patterns; furthermore, as the set of patterns may change over time, we also want to insert or delete a pattern. The major contribution of this paper is the first succinct index for dynamic dictionary matching. Prior to our work, the most compact index was given by Chan et al. (2007), which is based on the compressed suffix arrays (Grossi and Vitter (2005) and Sadakane (2003)) and the FM-index (Ferragina and Manzini (2005)), and requires O(nσ) bits, where n is the total length of the patterns and σ is the alphabet size. We develop a dynamic succinct index using a different (and simpler) paradigm based on suffix sampling. The new index improves not only the space complexity, to (1 + o(1))n log σ + O(k log n) bits, but also the time complexity of the query and update operations. Specifically, the query and update operations take O(|T| log n + occ) and O(|P| log σ + log n) time, respectively, where occ is the number of occurrences.
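
To make the three operations concrete, here is a minimal Python sketch of the interface that a dynamic dictionary matching index exposes. It is a naive, uncompressed trie baseline, not the suffix-sampling index from the paper, and it meets neither the space bound nor the time bounds quoted above. The class name `NaiveDictionaryMatcher` and its method names are hypothetical, chosen only for illustration.

```python
class NaiveDictionaryMatcher:
    """Naive baseline for dynamic dictionary matching (illustration only).

    Patterns live in a plain trie of nested dicts. This is NOT succinct
    and NOT the paper's index; it only shows what insert, delete and
    query are expected to do.
    """

    _END = "$end"  # multi-character key, so it cannot clash with a single symbol

    def __init__(self):
        self._root = {}

    def insert(self, pattern):
        """Add a pattern to the dictionary."""
        node = self._root
        for ch in pattern:
            node = node.setdefault(ch, {})
        node[self._END] = True

    def delete(self, pattern):
        """Remove a pattern; return True if it was present.

        Deletion is lazy: only the end marker is dropped, trie nodes stay.
        """
        node = self._root
        for ch in pattern:
            node = node.get(ch)
            if node is None:
                return False
        return node.pop(self._END, False)

    def query(self, text):
        """Return (start, pattern) for every occurrence of a pattern in text."""
        occurrences = []
        for start in range(len(text)):
            node = self._root
            for end in range(start, len(text)):
                node = node.get(text[end])
                if node is None:
                    break  # no pattern continues with this symbol
                if self._END in node:
                    occurrences.append((start, text[start:end + 1]))
        return occurrences


matcher = NaiveDictionaryMatcher()
for p in ("he", "she", "hers"):
    matcher.insert(p)
print(matcher.query("ushers"))  # [(1, 'she'), (2, 'he'), (2, 'hers')]
matcher.delete("he")
print(matcher.query("ushers"))  # [(1, 'she'), (2, 'hers')]
```

In this baseline a query restarts from the trie root at every text position, so its cost grows with the text length times the longest pattern length, and the trie uses far more than n log σ bits. The paper's contribution is supporting the same operations within succinct space.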

Similar resources

Applications of Succinct Dynamic Compact Tries to Some String Problems

The dynamic compact trie is a fundamental data structure for a wide range of string processing problems. In this paper, we report our recent work on succinct dynamic compact tries that store a set of strings of total length n in O(n log σ) space, supporting pattern matching and insert/delete operations in O((|P|/α)f(n)) time, where P is a pattern string, α = Θ(log_σ n), and f(n) = O((log log n) ...

Design of Practical Succinct Data Structures for Large Data Collections

We describe a set of basic succinct data structures which have been implemented as part of the Succinct library, and applications on top of the library: an index to speed up access to collections of semi-structured data, a compressed string dictionary, and a compressed dictionary for scored strings which supports top-k prefix matching.

Succinct Online Dictionary Matching with Improved Worst-Case Guarantees

In the online dictionary matching problem, the goal is to preprocess a set of patterns D = {P1, ..., Pd} over alphabet Σ, so that given an online text (one character at a time) we report all of the occurrences of patterns that are a suffix of the current text before the following character arrives. We introduce a succinct Aho-Corasick-like data structure for the online dictionary matching pro...

A Framework for Dynamic Parameterized Dictionary Matching

Two equal-length strings S and S′ are a parameterized-match (p-match) iff there exists a one-to-one function that renames the characters in S to those in S′. Let P be a collection of d patterns of total length n characters that are chosen from an alphabet Σ of cardinality σ. The task is to index P such that we can support the following operations: search(T): given a text T, report all occurren...

Dynamic 2D Dictionary Matching in Small Space

The dictionary matching problem preprocesses a set of patterns and finds all occurrences of each of the patterns in a text when it is provided. We focus on the dynamic setting, in which patterns can be inserted into and removed from the dictionary without reprocessing the entire dictionary. This article presents the first algorithm that performs dynamic dictionary matching on two-dimensional dat...

Publication date: 2009